Exponential Lower Bounds for Policy Iteration

Author

  • John Fearnley
Abstract

We study policy iteration for infinite-horizon Markov decision processes. It has recently been shown that policy-iteration-style algorithms have exponential lower bounds in a two-player game setting. We extend these lower bounds to Markov decision processes with the total-reward and average-reward optimality criteria.

Similar resources

Exponential Lower Bounds for Solving Infinitary Payoff Games and Linear Programs

Parity games form an intriguing family of infinitary payoff games whose solution is equivalent to the solution of important problems in automatic verification and automata theory. They also form a very natural subclass of mean and discounted payoff games, which in turn are very natural subclasses of turn-based stochastic payoff games. From a theoretical point of view, solving these games is one...

A subexponential lower bound for the Least Recently Considered rule for solving linear programs and games

The simplex algorithm is among the most widely used algorithms for solving linear programs in practice. Most pivoting rules are known, however, to need an exponential number of steps to solve some linear programs. No non-polynomial lower bounds were known, prior to this work, for Cunningham’s Least Recently Considered rule [5], which belongs to the family of history-based rules. Also known as t...

Capacity Bounds and High-SNR Capacity of the Additive Exponential Noise Channel With Additive Exponential Interference

Communication in the presence of a priori known interference at the encoder has gained great interest because of its many practical applications. In this paper, the additive exponential noise channel with additive exponential interference (AENC-AEI) known non-causally at the transmitter is introduced as a new variant of such communication scenarios. First, it is shown that the additive Gaussian ch...

Subexponential lower bounds for randomized pivoting rules for solving linear programs

The simplex algorithm is among the most widely used algorithms for solving linear programs in practice. Most deterministic pivoting rules are known, however, to need an exponential number of steps to solve some linear programs. No non-polynomial lower bounds were known, prior to this work, for randomized pivoting rules. We provide the first subexponential (i.e., of the form 2^Ω(n^α), for some α > 0...

On the Complexity of Policy Iteration

Decision-making problems in uncertain or stochastic domains are often formulated as Markov decision processes (MDPs). Policy iteration (PI) is a popular algorithm for searching over policy-space, the size of which is exponential in the number of states. We are interested in bounds on the complexity of PI that do not depend on the value of the discount factor. In this paper we prove the first...

Publication date: 2010